Amazon SageMaker AI

8 posts

In this post, we discuss the challenges organizations face when updating models in production. We then take a deep dive into the new rolling update feature for inference components and demonstrate it with practical examples using DeepSeek distilled models. Finally, we explore how to set up rolling updates in different scenarios.

Melanie Li, 3/25/2025

Amazon Web Services

Amazon SageMaker AI Announcements Artificial Intelligence AWS Inferentia AWS Neuron AWS Trainium

How to run Qwen 2.5 on AWS AI chips using Hugging Face libraries

In this post, we outline how to get started with deploying the Qwen 2.5 family of models on an Inferentia instance with Amazon Elastic Compute Cloud (Amazon EC2) and Amazon SageMaker, using the Hugging Face Text Generation Inference (TGI) container and the Hugging Face Optimum Neuron library. The Qwen 2.5 Coder and Math variants are also supported.

Jim Burtoft, 3/13/2025

Amazon Web Services

Amazon SageMaker AI Artificial Intelligence Foundation models Generative AI

Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on Amazon SageMaker AI

In this post, we demonstrate how to optimize hosting DeepSeek-R1 distilled models with Hugging Face Text Generation Inference (TGI) on Amazon SageMaker AI.

Pranav Murthy, 3/13/2025

Amazon Web Services

Amazon SageMaker AI Foundation models Generative AI

Deploy DeepSeek-R1 distilled models on Amazon SageMaker using a Large Model Inference container

Deploying DeepSeek models on SageMaker AI provides a robust solution for organizations seeking to use state-of-the-art language models in their applications. In this post, we show how to use the distilled models in SageMaker AI, which offers several options to deploy the distilled versions of the R1 model.

Dmitry Soldatkin, 3/11/2025

Amazon Web Services

Amazon SageMaker AI Amazon SageMaker Studio Artificial Intelligence Best Practices Intermediate (200) Technical How-to

Time series forecasting with LLM-based foundation models and scalable AIOps on AWS

In this blog post, we guide you through integrating Chronos into an Amazon SageMaker Pipeline, using a synthetic dataset that simulates a sales forecasting scenario to produce accurate and efficient predictions with minimal data.

Nick Biso, 3/5/2025

Amazon Web Services

Advanced (300) Amazon Machine Learning Amazon SageMaker AI Amazon SageMaker HyperPod Best Practices Generative AI

Customize DeepSeek-R1 distilled models using Amazon SageMaker HyperPod recipes – Part 1

In this two-part series, we discuss how you can reduce the complexity of DeepSeek model customization by using the pre-built fine-tuning workflows (also called “recipes”) for both the DeepSeek-R1 model and its distilled variations, released as part of Amazon SageMaker HyperPod recipes. In this first post, we build a solution architecture for fine-tuning DeepSeek-R1 distilled models and demonstrate the approach with a step-by-step example of customizing the DeepSeek-R1 Distill Qwen 7b model using recipes, achieving an average of 25% on all the ROUGE scores, with a maximum of 49% on the ROUGE-2 score, with both SageMaker HyperPod and SageMaker training jobs. The second part of the series will focus on fine-tuning the DeepSeek-R1 671b model itself.

Kanwaljit Khurmi, 3/3/2025

Amazon Web Services

Amazon SageMaker AI Amazon SageMaker Lakehouse Best Practices Technical How-to AI/ML Amazon SageMaker MLOps

Governing the ML lifecycle at scale, Part 4: Scaling MLOps with security and governance controls

This post provides detailed steps for setting up the key components of a multi-account ML platform. This includes configuring the ML Shared Services Account, which manages the central templates, model registry, and deployment pipelines; sharing the ML Admin and SageMaker Projects Portfolios from the central Service Catalog; and setting up the individual ML Development Accounts where data scientists can build and train models.

Jia (Vivian) Li, 2/7/2025

Amazon Web Services

Amazon Machine Learning Amazon SageMaker Amazon SageMaker AI Innovation and Reinvention Intermediate (200) Monitoring and observability Python Security Security & Governance Technical How-to

Efficiently build and tune custom log anomaly detection models with Amazon SageMaker

In this post, we walk you through building an automated mechanism that uses Amazon SageMaker to process your log data, run training iterations over it to obtain the best-performing anomaly detection model, and register that model with the Amazon SageMaker Model Registry for your customers to use.

Nitesh Sehwani, 1/6/2025